Introduction and Motivation


We have been studying [6], [7] an integer-only arithmetic called the Residue Number System (RNS).
Because RNS is an integer-only arithmetic, it cannot directly perform division or square-root
operations. The applications most amenable to integer-only arithmetic are the basic signal
processing functions that are multiply-accumulate intensive, such as FIR (Finite Impulse
Response) digital filters, convolution, correlation, and FFTs (Fast Fourier Transforms).
These  operations  have  no  feedback  and  hence  the  required  dynamic  range  is  finite,  and  can  easily 
be  predicted  for  worst  case  growth.  The  processor  required  to  perform  these  functions  must  be 
able to handle the worst case growth. If it is expected that the computed results will exceed the
processor's dynamic range (whether this is predicted before processing starts or checked
continually during processing), scaling will be required so that overflow is avoided. Overflow
results in unrecoverable errors in the output, yielding meaningless solutions.


In  past  research,  we  have  found,  through  simulation  and  analysis,  that  the  Residue  Number 
System  arithmetic  can  provide  up  to  8  times  speed  improvement  over  a  state-of-the-art  DSP 
(digital  signal  processor),  based  on  the  conventional  binary  integer  arithmetic,  at  the  same  clock 
speed.  This  improvement  is  based  on  multiply-accumulate-only  operations.  No  division  is 
considered.  If  division  is  part  of  the  algorithm,  the  improvement  is  less,  due  to  the  overhead  of 
RNS-to-Binary  conversion.  For  comparison  purposes,  it  was  assumed  that  the  non-RNS 
operations  were  performed  on  the  same  processor  as  the  binary  processor  under  comparison,  so 
the  degradation  is  mostly  due  to  the  RNS-to-Binary  conversion  overhead. 
Scaling operations must be handled very carefully. There are a few general issues that must be
examined. First, does the problem allow scaling, and must the scale factor be saved to recover
the true solution, or can it be ignored? For example, in the adaptive beamforming
problem, Rw = s, the weight vector may be scaled by any real number since it is only the
relative magnitudes of the weights that are of importance. In this example, the scale factor does
not need to be saved. The problem, though, is that too much scaling will reduce the precision of
the solution, resulting in degraded overall signal-to-noise ratio.

Another issue is defining the difference between the scale operation and division, and the choice
of the scale factor to be used. We define scaling as dividing by an arbitrary number, to decrease
the dynamic range required to store the result. We use the term division when the divisor is not
arbitrary, but is required by a particular algorithm. For example, in linear algebra, there are
algorithms that require a normalized vector v/||v||. This division by the norm of the vector is used,
for example, in producing sets of orthonormal vectors. If the division were replaced by an
arbitrary scale factor, the algorithm may not necessarily produce orthogonal vectors, which may
exacerbate any ill-conditioning problems.

In terms of speed, the scaling operation is more desirable than the division operation. For
instance, in a binary computer, scaling (division by $2^n$) is done by shifting the bits of a word to
the right n times, a simple operation in binary. Division by a specific number requires many
more operations, including shifts and subtracts. In the residue number system, division requires
conversion of the RNS representation back to binary using the Chinese Remainder Theorem,
CRT. (See [11], for example.) The quotient is then rounded to an integer and converted back to
the RNS. The RNS-to-Binary conversion is a very time-consuming process, though Binary-to-RNS
conversion is more straightforward. On the other hand, as in binary, scaling is a little simpler in
RNS. There are a variety of methods. One method scales the number by one or more of the
moduli in the residue number system. Another method, based on the CRT and called the L-CRT,
was developed by researchers at the University of Florida [4]. The L-CRT scales the RNS number
by a fixed, known constant, and avoids a full RNS-to-Binary conversion.
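The contrast between cheap multiply-accumulate and costly division in RNS can be sketched as follows. This is a minimal illustration, not the processor design studied here: the moduli (13, 15, 16, 17) are hypothetical values chosen only because they are pairwise coprime, the function names are invented for the example, and only non-negative values are handled.

```python
from math import prod

# Hypothetical pairwise-coprime moduli, chosen only for illustration.
MODULI = (13, 15, 16, 17)
RANGE = prod(MODULI)  # dynamic range of this toy RNS (53040)

def to_rns(x):
    """Binary-to-RNS: one independent remainder per modulus."""
    return tuple(x % m for m in MODULI)

def rns_mac(acc, a, b):
    """Multiply-accumulate acc + a*b, channel by channel, with no carries between channels."""
    return tuple((r + p * q) % m for r, p, q, m in zip(acc, a, b, MODULI))

def from_rns(residues):
    """RNS-to-binary via the Chinese Remainder Theorem (the costly step)."""
    total = 0
    for r, m in zip(residues, MODULI):
        partial = RANGE // m
        total += r * partial * pow(partial, -1, m)  # modular inverse (Python 3.8+)
    return total % RANGE

def rns_divide(residues, divisor):
    """Division as described in the text: convert out, divide with rounding, convert back."""
    value = from_rns(residues)
    quotient = (2 * value + divisor) // (2 * divisor)  # round to nearest integer
    return to_rns(quotient)
```

The multiply-accumulate touches each residue channel independently, while the division needs the full CRT reconstruction before it can proceed, which is the overhead the text refers to.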

Other researchers, including Westinghouse, have indicated that one of the biggest problems with
implementing Gauss Elimination in RNS is the dynamic range issue and implementing scaling
to overcome it. The mathematical details, including accuracy tradeoffs, were never presented. We
will specifically address this issue.


The  following  analysis  is  to  be  used  as  background  for  the  ultimate  questions  of  whether  we  can 
use  an  RNS  processor  successfully  for  adaptive  beamforming,  ABF,  applications,  what  constraints 
are  required  for  the  processor,  and  guidelines  to  design  a  processor  for  adaptive  beamforming. 
We  will  study,  in  detail,  the  dynamic  range  requirements,  for  the  Gauss  Elimination  algorithm, 
which  can  be  used  for  adaptive  beamforming.  We  chose  to  study  Gauss  Elimination  because  of 
its  simplicity  in  implementation.  Though  there  are  other  more  robust  algorithms  that  should  be 
studied  later,  we  observe  that  for  the  covariance  matrix  formulation  of  the  ABF  problem,  the 
matrix is positive definite Hermitian so that there is no significant theoretical advantage to other
methods.  The  QR  algorithm  for  the  least  squares  formulation  of  the  problem  is  numerically  stable 
but  in  its  basic  form  requires  substantial  use  of  non-RNS  operations  such  as  square-roots.  This 
approach  has  been  studied  in  terms  of  architecture  and  some  precision  analysis  by  Ward, 
Hargrave  and  McWhirter  [12]. 

For completeness it is desirable to say a little here about the adaptive beamforming problem. A
typical beamforming situation is shown in Figure 2. An array of N antenna elements is sampled
at time k to form a complex snapshot vector $x_k$. A collection of K of these snapshots constitutes
the N×K (N<K) data matrix X. Inner products between the data vector $x_k$ and complex weights w
form the complex scalar outputs $y_k$. For the times from 1 to K, these outputs form the output
vector y. The problem is to determine the weights $w_0, \ldots, w_{N-1}$ that will optimize the
response y in some sense. When it is necessary to continually adjust the weights, we say that we
are doing adaptive beamforming.

Thus we have

$\text{Input} = x(t) = [x_0(t), x_1(t), \ldots, x_{N-1}(t)]^T$

and seek

$\text{Weights} = w = [w_0, w_1, \ldots, w_{N-1}]^T$
The  situation  is  illustrated  in  Figure  1  below. 

FIGURE 1: Array (figure not reproduced)


We can derive the optimal weights to minimize the mean-square error, MSE = $E[|\epsilon_k|^2]$, where the
error signal $\epsilon_k = d_k - y_k$ is the difference between the desired response $d_k$ and the output $y_k = w^H x_k$.

$|\epsilon_k|^2 = |d_k|^2 - 2\,\mathrm{Re}(d_k^* w^H x_k) + w^H x_k x_k^H w$

Taking expected values of both sides yields

$E[|\epsilon_k|^2] = E[|d_k|^2] - 2\,\mathrm{Re}(w^H r_{xd}) + w^H R_{xx} w$

where $R_{xx} = E[x_k x_k^H]$ and $r_{xd} = E[x_k d_k^*]$. To minimize this function, we set the gradient
with respect to the weight vector equal to zero, that is,

$R_{xx} w = r_{xd}$
An approximation R to the correlation matrix $R_{xx}$ (also called the covariance matrix for zero-mean
data [8]) is formed from the N×K data matrix X. $R_{xx}$ is the complex N×N matrix $R_{xx} = E[x_k x_k^H]$,
which is an infinite time average. Since we only have a finite number K of snapshots, we use the
estimated covariance matrix

$R = XX^H / K$

The covariance matrix is always non-singular, and hence R is a positive definite Hermitian
matrix, since statistically independent noise exists on the antenna elements. The noise correlation
matrix is just $R_n = \sigma^2 I$, where $\sigma^2$ is the noise variance (power), and I is the identity matrix of size
N. That is, the cross-correlation terms average out while the autocorrelation terms average to the
variance of the noise. The data covariance matrix is made from the sum of the signal, jammer
and noise covariance matrices: $R = R_s + R_j + R_n$.

The  weight  vector  is  found  by  solving  the  system  Rw  =  s  where  either 

(a) s could be the steering vector given by

$s = [1, e^{j\phi}, e^{j2\phi}, \ldots, e^{j(N-1)\phi}]^T$

where $\phi = (2\pi d/\lambda)\sin\theta$ and $\theta$ is the desired look-angle with respect to the normal to the linear
antenna array; d is the inter-element spacing and $\lambda$ is the wavelength of the incoming signal at
the carrier frequency, or

(b) s could be the cross-correlation vector

$r_{xd} = E[x_k d_k^*] \approx X d^* / K$

where $d_k$ is the reference signal sampled at time k, $x_k = (x_0, x_1, \ldots, x_{N-1})^T$ is the snapshot vector
at time k, and, as before, E[·] is the expectation operator.

Algorithms which have been used for solving this covariance matrix form of the problem include
Gauss elimination, Cholesky decomposition, and the recursive least-squares (RLS) method based
on the matrix inverse lemma [5, p. 385].

Parameters 

Principal  parameters  of  the  problem  and  the  notation  we  shall  use  for  them  are 
N = # array elements
K = # "snapshots"

$q_w$ = # bits accuracy required for weight vector

$q_x$ = # bits accuracy in data matrix, or "quantization" (also gives numerical range of data
matrix entries)

$\kappa$ = condition number of covariance matrix

$1 + L_S$ = wordlength (in bits) for storage, equivalent to specifying dynamic range

$L_A$ = accumulator wordlength.

The  basic  steps  to  analyse  are: 

1. Formation of the covariance matrix

2.  Forward  elimination 

3.  Back  substitution 

4. Sensitivity of the solution to $q_w$

The  analysis  must  consider  both  range  and  precision.  Both  of  these  will  show  some  dependence 
on  the  eigenvalue  structure  and  conditioning  of  the  covariance  matrix. 

1. Formation of the covariance matrix and cross-correlation vector

Each element of the covariance matrix is obtained as the inner product of two complex K-vectors
with components in the interval $[-2^{q_x}, 2^{q_x}]$ so that the real and imaginary components of each
complex product are in the range $[-2(2^{q_x})^2, 2(2^{q_x})^2] = [-2^{2q_x+1}, 2^{2q_x+1}]$, which in turn implies that
the real and imaginary parts of the elements of the scaled (1) covariance matrix lie in the interval
$[-K\,2^{2q_x+1}, K\,2^{2q_x+1}]$. If the full integer (scaled) covariance matrix is to be stored then it follows that

$L_S \ge 1 + 2q_x + \log K$ (1)

(1) Note that the term "scaled" means "scaled up" throughout. The scaling results from the omission of the division by K.

Here and throughout log is used for the base-2 logarithm $\log_2$. Of course there will be much
greater growth in the dynamic range (and so in the required value of $L_S$) during subsequent
stages.
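As a quick check of the storage requirement for the scaled covariance matrix, the sketch below evaluates the bound, reading (1) as $L_S \ge 1 + 2q_x + \log_2 K$, and compares it with the worst-case accumulated sum. The function names are illustrative and not part of the original analysis.

```python
from math import ceil, log2

def scaled_covariance_bits(q_x, K):
    """Magnitude bits L_S for the scaled covariance matrix,
    reading (1) as L_S >= 1 + 2*q_x + log2(K)."""
    return ceil(1 + 2 * q_x + log2(K))

def worst_case_entry(q_x, K):
    """Largest real or imaginary part of a sum of K complex products whose
    factors have real and imaginary parts of magnitude at most 2**q_x."""
    per_product = 2 * (2 ** q_x) ** 2  # bound on |Re| or |Im| of one product
    return K * per_product

# Example: 8-bit data and 16 snapshots.
bits = scaled_covariance_bits(8, 16)
assert worst_case_entry(8, 16) <= 2 ** bits
```

With K = 16 the bound reduces to $5 + 2q_x$ bits.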

If, on the other hand, the actual covariance matrix is to be formed there is a division of these
inner products by K so that the range is reduced by this factor and (1) is replaced by the
requirement

$L_S \ge 1 + 2q_x$ (2)

Note that in the case where K is a power of 2, $K = 2^p$, this particular division is equivalent to
using the final p bits of the accumulated sums for rounding of the $L_S$-bit words. In this case there
is an error in the representation which, for a sufficiently large accumulator, can be kept
to a single roundoff in the final division. Specifically, if

$L_A \ge 1 + 2q_x + \log K$ (3)

then the absolute errors in the real and imaginary parts of the elements of the covariance matrix
R are bounded by 1/2.
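The single-roundoff division by $K = 2^p$ can be sketched as an add followed by a shift. This is an illustrative helper, not code from the original work; it relies on Python's arithmetic right shift, which floors for negative integers.

```python
def divide_by_power_of_two(total, p):
    """Round total / 2**p to the nearest integer using one add and one shift.
    Adding half the divisor before the shift keeps the error to at most 1/2."""
    half = (1 << (p - 1)) if p > 0 else 0
    return (total + half) >> p
```

Applied separately to the real and imaginary parts of each accumulated sum, this gives the error bound of 1/2 quoted above, provided the accumulator holds the full sum before the shift.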

We  recall  here  some  of  the  basic  definitions  of  the  matrix  and  vector  norms  which  we  are  using 
in  this  discussion.  Although  the  dynamic  range  is  directly  related  to  the  magnitudes  of  the  real 
and  imaginary  parts  of  the  various  complex  quantities  the  analysis  is  simplified  by  using  the 
magnitudes  of  the  complex  numbers  themselves: 

For a complex vector $z = (z_j)$ the maximum or ∞-norm is defined by

$\|z\|_\infty = \max_j |z_j|$ (5)

and then the associated matrix norm is defined as usual by

$\|A\|_\infty = \max\{\|Az\|_\infty : \|z\|_\infty \le 1\}$ (6)

which is given by the maximum (absolute) row sum of the matrix A:

$\|A\|_\infty = \max_i \sum_j |a_{ij}|$ (7)
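The two norms can be computed directly from their definitions; a minimal sketch using Python's built-in complex type (the function names are illustrative):

```python
def vector_inf_norm(z):
    """Maximum modulus of the components of a complex vector, as in (5)."""
    return max(abs(zj) for zj in z)

def matrix_inf_norm(A):
    """Maximum absolute row sum of a complex matrix given as a list of rows, as in (7)."""
    return max(sum(abs(a) for a in row) for row in A)
```

For instance, the matrix with rows (1, -2j) and (3+4j, 1) has row sums 3 and 6, so its ∞-norm is 6.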

It follows now that the absolute errors in the elements of the computed covariance matrix R are
bounded by $1/\sqrt{2}$; that is, the ∞-norm error bound for R is

$\|\delta R\|_\infty \le N/\sqrt{2}$ (8)

Here we have used $\delta R$ to denote the error in the matrix R; corresponding notation will be used
for other errors subsequently.

The two cases, scaled and unscaled, must both be pursued for their effects on dynamic range
and accuracy requirements. The bound (8) can be used to obtain error estimates for the solution
obtained by Gauss elimination.

The formation of the cross-correlation vector y for the right-hand side is similar, with components
consisting of inner products between the snapshot vectors comprising the data matrix and some
desired response d. If we assume the same quantization for d as for the data matrix, then (1) and
(3) summarize the options for the wordlength in just the same way as above. In the absence of
scaling, the division by K again introduces an error which (using the same analysis as above)
satisfies

$\|\delta y\|_\infty \le 1/\sqrt{2}$ (9)

Example: For the special case N = 4, K = 16, the inequality (1) becomes $L_S \ge 5 + 2q_x$ for the
scaled matrix. Without the scaling, (3) becomes $L_A \ge 5 + 2q_x$, and the error bound (8) is
$\|\delta R\|_\infty \le 4/\sqrt{2} = 2\sqrt{2}$.

The error bounds (8) and (9) can be used to obtain bounds on the error in the solution of the
resulting (unscaled) linear system making use of condition number estimates which can be
obtained from signal strengths [1]. The next short section summarizes this work on the
eigenvalues of the covariance matrix.

2. The eigenvalue spread for the covariance matrix of adaptive beamforming

Compton ([1], Section 4.6, pp. 258-275) studies the eigenvalues and therefore the condition number
of the covariance matrix. His overall findings can be summarized as follows:

The eigenvalues of the covariance matrix (normalized relative to the background noise level) are
all greater than or equal to unity so that

$\lambda_{\min} \ge 1$ (10)

The only non-unit eigenvalues are directly related to the powers of the various signals, both the
desired and interference signals. In the case of a single jammer of significantly greater power
than the desired signal, the two largest eigenvalues are (approximately) proportional to the
number of antennas and the powers of the jammer and of the desired signal respectively.

It follows that the largest eigenvalue is well-approximated by N times the most powerful signal's
(the jammer's) SNR. That is, with the same normalization as above:

$\lambda_{\max} \approx N \times \mathrm{SNR}_j$ (11)

It follows that

$\kappa \approx N \times \mathrm{SNR}_j$ (12)

which for a jammer of 40 dB with N = 4 antennas means that $\kappa \approx 4 \times 10^4$.
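The estimate (12) is easy to evaluate for other scenarios. The helper below is an illustrative sketch that assumes, as in the text, that the jammer SNR is quoted in dB relative to the normalized noise level; the function name is not from the original.

```python
from math import log10

def condition_number_estimate(n_antennas, jammer_snr_db):
    """Approximate condition number, kappa ~ N * SNR_j, with SNR_j given in dB."""
    return n_antennas * 10 ** (jammer_snr_db / 10)

kappa = condition_number_estimate(4, 40)  # the 40 dB, N = 4 case from the text
decimal_digits_at_risk = log10(kappa)     # roughly 4.6 decimal figures
```

The logarithm of the estimate indicates how many significant decimal figures a classical condition-number analysis would predict could be lost.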

In the conventional error analysis for Gaussian elimination of Wilkinson [14] this condition
number is essentially the scale factor by which errors are magnified during the solution. However,
the more recent analysis of Demmel [2] and others using a relative ∞-norm suggests that the
norm of the inverse matrix may be a better condition number; and, in this case, that is close to
unity.

The  relative  merits  of  these  two  analyses  for  the  adaptive  beamforming  problem  should  be 
investigated  further. 

Monzingo  and  Miller  [8]  consider  the  effect  of  the  eigenvalue  spread  on  the  accuracy  of  the 
solution  obtained  from  the  covariance  matrix  solution.  In  particular,  their  experimental  analysis 
shows  that  with  an  eigenvalue  spread  of  around  40  dB  and  using  a  mere  10  bit  wordlength  there 
was  a  degradation  in  the  solution  of  no  more  than  2  dB  which  corresponds  to  only  one  or  two 
significant  bits  rather  than  the  four  significant  decimal  figures  which  might  be  expected  from  the 
large  condition  number. 

The other aspect of their work relevant to the present discussion centers around the question of
the number of snapshots which are necessary in order to achieve acceptable accuracy in the
solution. Their conclusion was that, for a 3 dB loss, K = 2N was sufficient. It is anticipated that
this figure is too small for many scenarios and therefore results here are presented for three cases:
K/N = 2, 3, 4.
3. Forward elimination

In this section we are primarily interested in the dynamic range requirements of the forward
elimination phase of our solution process. We shall consider the solution of a linear system

$Ax = b$ (13)

and denote the elements of the positive definite Hermitian matrix A by $a_{ij}$ and the components
of the right-hand-side vector by $b_i$. The basic algorithm for the forward elimination phase of
Gauss elimination can then be written in the ijk-form as:

Forward elimination algorithm

for i = 1 to N-1
    for j = i+1 to N
        m := a_ji / a_ii
        a_ji := 0
        b_j := b_j - m b_i
        for k = i+1 to N
            a_jk := a_jk - m a_ik
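For reference, the ijk-form above translates directly into the following floating-point sketch. It uses 0-based indices and Python's built-in complex type, performs no pivoting, and is meant only as a baseline against which the integer variants discussed in this section can be compared; it is not the integer algorithm itself.

```python
def forward_eliminate(A, b):
    """In-place ijk-form forward elimination without pivoting.
    A is a list of rows (Hermitian positive definite), b the right-hand side."""
    n = len(A)
    for i in range(n - 1):
        for j in range(i + 1, n):
            m = A[j][i] / A[i][i]       # multiplier for row j
            A[j][i] = 0                 # entry below the pivot is eliminated
            b[j] -= m * b[i]            # update the right-hand side
            for k in range(i + 1, n):
                A[j][k] -= m * A[i][k]  # update the remaining row entries
    return A, b
```

After the call, A is upper triangular and back substitution can proceed on the modified b.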

Again  there  are  two  cases  to  consider  depending  on  whether  the  covariance  matrix  is  or  is  not 
scaled. 

3.1  The  scaled  covariance  matrix 

We  assume  here  that  (1)  is  satisfied  and  so  the  scaled  (that  is,  no  division  by  K)  form  of  the 
covariance  matrix  can  be  used.  Consider  first  the  dynamic  range  growth  which  is  implicit  in  the 
Gauss  elimination  phase  of  the  solution.  Since  the  covariance  matrix  is  known  to  be  positive 
definite  Hermitian,  no  pivoting  is  used.  (See  [14],  for  example,  for  justification  of  this.) 

At each step of the innermost loop of the process, a complex 2×2 matrix is being reduced as
follows:

$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \rightarrow \begin{pmatrix} a & b \\ 0 & d - bc/a \end{pmatrix}$ (14)

The  most  immediate  question  is  whether  the  division  can  be  incorporated  into  the  algorithm  in 
a  meaningful  way.  Now  for  our  positive  definite  Hermitian  matrix,  the  diagonal  elements  are  all 
real  and  remain  real  throughout  the  elimination  phase.  Therefore  the  divisor  a  in  (14)  is  always 
real  and,  for  our  problem  and  solution,  must  be  a  real  integer. 

The standard implementation of Gauss elimination as in the above algorithm would compute the
appropriate multiplier for the current row and then use this multiplier for all the relevant
matrix entries. In the integer arithmetic situation this is not possible since the multiplier would
need to be rounded to a (complex) integer and this would always round to zero if the
off-diagonal entry was smaller than the pivot. Since the largest element of the matrix necessarily lies
on the diagonal [3], [13], this rounding of the multiplier to zero must occur at some point in the
elimination phase.

The preferred alternative is therefore to compute b×c first in a full length accumulator and then
to perform the rounded division by a. This places a constraint on the accumulator size in order
to accommodate the full components of each complex multiplication:

$L_A \ge 2L_S + 1$ (15)
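The difference between the two orderings can be seen in a small integer sketch. Complex integers are represented here as (re, im) pairs of Python ints, the pivot a is a positive real integer as argued above, and the function names and numerical values are illustrative assumptions rather than part of the original analysis.

```python
def round_div(x, a):
    """Nearest integer to x / a for integer x and positive integer a (error at most 1/2)."""
    return (2 * x + a) // (2 * a)

def eliminate_entry(a, b, c, d):
    """Rounded integer value of d - b*c/a, forming the product b*c first.
    The product needs the full-length accumulator of (15)."""
    prod_re = b[0] * c[0] - b[1] * c[1]
    prod_im = b[0] * c[1] + b[1] * c[0]
    return (d[0] - round_div(prod_re, a), d[1] - round_div(prod_im, a))

def eliminate_entry_multiplier_first(a, b, c, d):
    """The standard ordering, shown for contrast: round the multiplier c/a first."""
    m = (round_div(c[0], a), round_div(c[1], a))
    return (d[0] - (m[0] * b[0] - m[1] * b[1]),
            d[1] - (m[0] * b[1] + m[1] * b[0]))
```

With a = 100, b = (30, 40), c = (30, -40) and d = (90, 0), forming the product first gives (65, 0), whereas rounding the multiplier first gives a zero multiplier and leaves d unchanged at (90, 0).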

The division by a of course reduces the range of this product. Wilkinson's analysis [14] of
Gaussian elimination shows that for a positive definite matrix the growth factor is 1 and therefore
that the true value of $d - bc/a$ in (14) is bounded by the largest element of the original matrix. The
"growth factor" referred to here is the estimate of the ratio of the magnitudes of the largest
elements in the original matrix and the final upper triangular matrix at the end of the elimination
phase.

Now the division can be performed so as to give an integer result which has an error no greater
than 1/2 and, provided the positive definiteness is not lost during the elimination phase, the only
element growth that can occur results from these rounding errors so that

$\max |a_{ij}^{(k)}|_\infty \le \max |a_{ij}|_\infty + (N-1)/2$ (16)

where the $a_{ij}^{(k)}$ are the elements of the matrix at the $k$th stage of the elimination and $|\cdot|_\infty$ denotes
the magnitude of the largest component of its complex argument. That is

$|z|_\infty = \max\{|\mathrm{Re}(z)|, |\mathrm{Im}(z)|\}$ (17)

Note further that since the largest element of A is necessarily positive and lies on the diagonal

$\max_{i,j} |a_{ij}|_\infty = \max_i a_{ii}$ (18)

It follows that at most one additional bit is needed in $L_S$ to accommodate this growth since in all
practical cases we certainly have $N/2 < 2^{L_S}$. (In reality it is highly unlikely that this extra bit
is ever needed for this growth.)

The effect of these errors on the solution must of course be analysed, too. A larger dynamic
range would allow fewer division or scaling operations and so could result in reduced error
bounds.

NAWCADWAR-95003-4.5

June 30, 1994

The wordlength requirements in (19) for the left-hand side matrix during the forward elimination
stage follow from (1), the additional bit to allow for growth, and (15):

$L_S \ge 2 + 2q_x + \log K, \quad L_A \ge 2L_S + 1$ (19)

The  growth  on  the  right-hand  side  is  a  more  serious  problem  since  there  is  no  correspondingly 
simple  bound  on  the  growth  factor. 

However, we can obtain bounds for the growth of the right-hand side as follows. Corresponding
to (14), each stage of the elimination results in modifications of the right-hand side of the form:

$b_j \leftarrow b_j - \dfrac{a_{ji}}{a_{ii}}\, b_i$ (20)

Now the magnitude of the multiplier here is bounded by the largest matrix element, $a_{\max}$ say,
since the largest element is on the diagonal, so that $|a_{ij}| \le a_{\max} - 1$ for $i \ne j$, while the diagonal
elements are positive integers and so $a_{ii} \ge 1$. Of course, as with the elimination on the left-hand
side, the multiplier here would not be computed but rather the multiplication would be performed
first and the division would follow.

Denote by M the range available for the initial (scaled) covariance matrix and cross-correlation
vector given by the initial lower bound for $L_S$ in (1). That is

$|a_{ij}| \le M = 2^{1+2q_x+\log K}$ (21)

It follows that the growth factor for the right-hand side at each stage is bounded by M so that
the real and imaginary parts of the final entry are bounded by $M^N$, which is to say the dynamic
range growth is linear in the wordlength. This compares with the faster-than-geometric growth
established for the divisionless algorithm in [7].

It follows that any wordlength satisfying

$L_S \ge N(1 + 2q_x + \log K) + 1$ (22)

will certainly suffice. (The additional 1 is again to allow for the very unlikely overspill resulting
from the accumulated roundoff error in the divisions.) Such a wordlength will also suffice for
the accumulator for the elimination phase since the final multiplication fits into this wordlength,
too. The critical dependencies here are clearly on N and $q_x$. The wordlengths given by (22) are
summarized for various cases in Table 1 below. Any technology has a maximum practical integer
arithmetic wordlength and this places a restriction on the size of the adaptive beamforming
problem which can be solved in this way without some further scaling to restrict the dynamic
range.

TABLE 1

Wordlengths in bits given by (22)

             K = 2N              K = 3N              K = 4N
q_x   N =   4     8    16       4     8    16       4     8    16
  4        49   105   225      52   110   235      53   113   241
  8        81   169   353*     84   174   363*     85   177   369*
 16       145   297*  609*    148   302*  619*    149   305*  625*

Entries marked * lie beyond the boundary (drawn as a broken line in the original table) of what
can be achieved with a 256 bit integer wordlength.

It  is  apparent  that  increasing  the  ratio  K/N  is  cheap  while  increasing  either  the  number  of 
antennas  or  the  quantization  is  considerably  more  expensive  with  the  increase  in  the  number  of 
antennas  carrying  the  highest  cost  in  terms  of  desired  wordlength  for  the  elimination  phase. 
This  analysis  of  dynamic  range  assumes  nothing  about  the  magnitudes  of  the  components  of  the 
solution.  Any  knowledge  of  the  range  of  these  components  would  lead  to  much  improved 
dynamic range estimates. This will be considered further in Section 3.3 after we have
summarized the corresponding analysis for the unscaled case.


3.2 The unscaled case

In  the  case  where  the  covariance  matrix  and  cross-correlation  vectors  are  not  scaled,  the  initial 
requirements  for  the  wordlengths  are  given  by  (2)  and  (3).  The  resulting  dynamic  range 
corresponding to (21) will be denoted by

$|a_{ij}| \le M' = 2^{1+2q_x}$ (23)

The matrix elements again have a growth factor of 1 and so roundoff effects simply demand one
extra bit. The growth analysis for the right-hand side is precisely analogous to that of the
previous section and leads to the bound $(M')^N$ from which we can deduce that choosing

$L_S \ge N(1 + 2q_x) + 1$ (24)

will suffice. Table 2 shows the wordlengths given by (24) for the same range of values of N, K
and $q_x$ as was used previously. However, in this case there is, of course, no dependence on K.
TABLE 2

Wordlengths in bits given by (24)

q_x   N =   4     8    16
  4        37    73   145
  8        69   137   273
 16       133   265   529

Again we see that the wordlengths grow rapidly with both N and $q_x$, severely limiting the size
of problem that can be handled with even a fairly large wordlength integer arithmetic.
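The entries of Tables 1 and 2 can be regenerated with a few lines of code. The sketch below assumes the readings $L_S \ge N(1 + 2q_x + \log_2 K) + 1$ for (22) and $L_S \ge N(1 + 2q_x) + 1$ for (24), rounded up to a whole number of bits; the function names are illustrative.

```python
from math import ceil, log2

def wordlength_scaled(N, K, q_x):
    """Wordlength under the assumed reading of (22), rounded up to whole bits."""
    return ceil(N * (1 + 2 * q_x + log2(K)) + 1)

def wordlength_unscaled(N, q_x):
    """Wordlength under the assumed reading of (24); independent of K."""
    return N * (1 + 2 * q_x) + 1

# Print one row of the scaled-case table per quantization level, for K = 2N.
for q_x in (4, 8, 16):
    print(q_x, [wordlength_scaled(N, 2 * N, q_x) for N in (4, 8, 16)])
```

Changing the factor 2 in `2 * N` to 3 or 4 gives the other two column groups of the scaled-case table.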

To  motivate  the  discussion  in  the  next  section,  consider  the  effect  of  knowing  a  bound  on  the 
weights. If the magnitudes of the weights are known to satisfy some bound of the form

$\|w\|_1 \le W$ (25)

then the elements of the right-hand side satisfy

$|b_j| \le W M'$ (26)

throughout the elimination phase. A similar conclusion holds for the scaled matrix with M in
place of M'. We recall here that the 1-norm $\|\cdot\|_1$ is defined by

$\|z\|_1 = \sum_j |z_j|$ (27)

These  bounds  suggest  that  much  shorter  wordlengths  may  be  useable.  Information  on  the  size  of 
the  right-hand  side  vector  is  therefore  likely  to  prove  valuable  in  reducing  the  computation 
wordlength  requirements  for  the  adaptive  beamforming  problem. 
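The step from a bound on the weights to a bound on the right-hand side is a one-line estimate. As a sketch, assuming the right-hand side is $b = Aw$ with every matrix element bounded in magnitude by $M'$ and $\|w\|_1 \le W$:

```latex
\[
|b_j| \;=\; \Bigl|\sum_{k} a_{jk}\, w_k\Bigr|
\;\le\; \max_{k} |a_{jk}| \sum_{k} |w_k|
\;\le\; M' \,\|w\|_1
\;\le\; W M' .
\]
```

The same estimate applies at each stage of the elimination because the intermediate systems have the same solution w and, with a growth factor of 1, their matrix elements obey the same bound up to the rounding allowance.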

3.3  "Backward"  range  analysis 

The intention is to compute the weights to an accuracy of $q_w$ bits. These weights are to be
represented by the complex integer solution of the linear system. We shall suppose that the
calculation of the weights is performed using a wordlength $1 + L_w$, that is, to an accuracy of $L_w$
bits together with the sign. Typically, we expect that $L_w > q_w$ to allow for some loss of precision
in the solution due to the various roundoff and data errors. $L_w$ therefore represents the appropriate
computation wordlength from which weights accurate to $q_w$ bits will be extracted.
Nitzberg [9] studies the precision in the weights that is required in order to achieve a 1 dB or
3 dB degradation as a function of the number N of antennas and the power of the jammer signal.
His findings are that $q_w$ depends linearly on the logarithms of both N and $\mathrm{SNR}_j$.

TABLE 3

Weight quantization, $q_w$, for a 3 dB degradation as a function of N and $\mathrm{SNR}_j$ from [9]

  N    20 dB   30 dB   40 dB   50 dB
  4      6       9      13      16
  8      7      10      14      17
 16      8      11      15      18

We summarize some of Nitzberg's findings in Table 3. For a 1 dB degradation each of these
quantization wordlengths should be increased by 1 bit. For example, with N = 8 and a 40 dB
jammer signal we require 14 bits accuracy in the weights for a 3 dB degradation while 15 bits
would be needed for 1 dB. Our present task is to determine from this requirement the appropriate
$L_w$ and $q_x$ which will yield this accuracy in the final solution.

In 1980, Nitzberg [10] extended his study to the question of how many bits are needed for the
matrix inversion and found a similar linear relation. This work took no account of the formation
of the sample covariance matrix or the quantization of the data. To some extent then, the present
work is a continuation of Nitzberg's work.

What is the significance of this for our dynamic range analysis? Once $L_w$ is determined, we have
a bound on the magnitudes of the weights and therefore can obtain the savings in necessary
wordlength anticipated at the end of the previous section. The specific relation between $L_w$ and
$q_w$ will depend on the error analysis which follows.

Denote the bound on the weights corresponding to the wordlength $L_w$ by $M_w$ so that

$|w_j| \le M_w = 2^{L_w}$ (28)

From the eigenvalue analysis referred to in Section 2, it follows, using (10) that

(29) 

where  b  is  the  unsealed  right-hand  side  vector.  Hence  using  the  wordlength  given  by  (2)  for  the 
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unsealed  matrix  it  follows  that  wordlengths  satisfying 

(30) 

will  suffice.  Now  using  the  bound  (28)  it  follows  that 

$\|w\|_1 < N M_w$ (31)

and  then  using  (26)  or  its  equivalent  for  the  scaled  matrix,  it  follows  that  the  modified  right-hand 
side  elements  generated  throughout  the  solution  satisfy 

$|b_j| < N M_w M'$ or $|b_j| < N M_w M$ (32)

With the wordlengths given by (19) for the scaled matrix this yields the wordlength estimate

$L_s \ge 4 q_x + \log N + \log K + 5/2$ (33)

for the dynamic range of the right-hand side. The corresponding accumulator wordlength need
not be sufficient to accommodate a full multiplication of words of this length since the only long
multiplies that are needed are between elements of the matrix and the right-hand side vector. It
follows that

$L_a \ge 6 q_x + \log N + 2 \log K + 4$ (34)

will suffice.

The only change needed for the unscaled case is that the terms in $\log K$ are not needed:

$L_s \ge 4 q_x + \log N + 5/2, \qquad L_a \ge 6 q_x + \log N + 4$ (35)
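
As a rough sketch, the right-hand side and accumulator wordlengths can be evaluated numerically. The code assumes that (33) - (35) read $L_s \ge 4q_x + \log_2 N + \log_2 K + 5/2$ and $L_a \ge 6q_x + \log_2 N + 2\log_2 K + 4$, with the $\log K$ terms dropped in the unscaled case; this reading reproduces the entries of Table 4. The function name `wordlengths` and the rounding up to whole bits are assumptions made for the illustration.

```python
import math
from typing import Optional, Tuple


def wordlengths(q_x: int, n: int, k: Optional[int] = None) -> Tuple[int, int]:
    """Return (L_s, L_a) in bits for data quantization q_x and N = n antennas.

    k is the number of snapshots for the scaled covariance matrix;
    k=None selects the unscaled case, where the log K terms are absent.
    """
    log_n = math.log2(n)
    log_k = math.log2(k) if k is not None else 0.0
    l_s = 4 * q_x + log_n + log_k + 2.5      # right-hand side wordlength
    l_a = 6 * q_x + log_n + 2 * log_k + 4    # accumulator wordlength
    # Round up to whole bits (an assumption; it matches the tabulated values).
    return math.ceil(l_s), math.ceil(l_a)
```

With `q_x = 4`, `n = 4` and `k = 8` (that is, K = 2N) this gives 24 and 36 bits, and with `k=None` it gives 21 and 30 bits.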


To illustrate the savings available relative to the earlier tables, we show in Table 4 the resulting
wordlengths for the various jammer strengths used in Table 3, with the same combinations of $N$
and $K$ as were used for Tables 1 and 2, with $q_x$ and $L_w = 1.5\,q_w$ satisfying (30) and the weight
quantizations of Table 3 for a 3 dB degradation.

The linear dependence on $q_w$ (and hence $L_w$ and $q_x$) and $\log N$ observed by Nitzberg is apparent
in these tables. The magnitude of the savings which are obtained from knowing bounds on the
weights is also apparent since all of these cases can easily be accommodated by a 128-bit
accumulator whereas many of them needed too large a dynamic range for 256 bits in Tables 1
and 2.

TABLE 4

Wordlengths $q_x$, $L_s$, $L_a$ given by (30), (33) - (35) with $q_w$ taken from Table 3

(a) SNR_J = 20 dB

            K = 2N            K = 4N           Unscaled
N        4    8   16       4    8   16       4    8   16
q_x      4    5    6       4    5    6       4    5    6
L_s     24   30   36      25   31   37      21   26   31
L_a     36   45   54      38   47   56      30   37   44

(b) SNR_J = 30 dB

            K = 2N            K = 4N           Unscaled
N        4    8   16       4    8   16       4    8   16
q_x      7    8    9       7    8    9       7    8    9
L_s     36   42   48      37   43   49      33   38   43
L_a     54   63   72      56   65   74      48   55   62

(c) SNR_J = 40 dB

            K = 2N            K = 4N           Unscaled
N        4    8   16       4    8   16       4    8   16
q_x     10   11   12      10   11   12      10   11   12
L_s     48   54   60      49   55   61      45   50   55
L_a     72   81   90      74   83   92      66   73   80

(d) SNR_J = 50 dB

            K = 2N            K = 4N           Unscaled
N        4    8   16       4    8   16       4    8   16
q_x     12   13   14      12   13   14      12   13   14
L_s     56   62   68      57   63   69      53   58   63
L_a     84   93  102      86   95  104      78   85   92

Clearly the error analysis will be important in determining the dependence of $L_w$ on the desired
accuracy which will in turn dictate the actual wordlengths that are needed.

4. Back substitution

To determine the dynamic range requirements for the back substitution phase of the solution, we
can consider the two cases (the scaled and unscaled covariance matrix) together since the rest of
the analysis is similar for both. At this stage we are interested in the solution of a system

$Uw = b$ (36)

where $U$ is an upper triangular matrix.

Using the notation of the previous section, we have the following bounds for the elements of this
system:

elements of     scaled          unscaled
U               $M$             $M'$
b               $N M_w M$       $N M_w M'$
w               $M_w$           $M_w$

In  fact,  for  the  individual  components  of  the  right-hand  side  vector  b,  we  can  obtain  the  tighter 
bounds: 

I  b- 1  <  1  or  M')  (37) 

for  the  same  two  cases. 

The  bounds  for  the  final  weights  clearly  can  be  accommodated  in  the  same  wordlengths  that  were 
used  for  the  forward  elimination  phase.  The  only  point  of  concern  is  therefore  the  accumulation 
of  the  (modified)  right-hand  side  prior  to  each  division  in  the  standard  loop: 

Back substitution algorithm

$w_N = b_N / u_{NN}$
for i = N-1 down to 1
    $w_i = \left( b_i - \sum_{j=i+1}^{N} u_{ij} w_j \right) / u_{ii}$

Note  again  that  each  division  operation  has  a  real  divisor  so  that  complex  division  is  avoided. 
To  see  that  the  same  accumulator  that  was  required  for  the  elimination  suffices  for  this  stage  of 
the  solution,  we  must  consider  the  right-hand  side  of  the  above  loop  operation. 
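
The loop can be sketched in exact integer arithmetic as follows. This is a minimal illustration for a real integer system, not the report's implementation: `round_div` and `back_substitute` are hypothetical names, and rounding ties away from zero is an assumed convention. In the complex case the same rounded division would be applied separately to the real and imaginary parts of the accumulated value, since the divisor is real.

```python
from typing import List


def round_div(a: int, d: int) -> int:
    """Divide integer a by positive integer d, rounded to the nearest integer."""
    if d <= 0:
        raise ValueError("divisor must be a positive integer")
    q, r = divmod(abs(a), d)
    if 2 * r >= d:          # round half away from zero (assumed convention)
        q += 1
    return q if a >= 0 else -q


def back_substitute(U: List[List[int]], b: List[int]) -> List[int]:
    """Solve U w = b for upper triangular integer U with positive diagonal.

    The accumulation is exact (Python integers do not overflow); the only
    error committed is the rounding of each final division.
    """
    n = len(b)
    w = [0] * n
    for i in range(n - 1, -1, -1):
        acc = b[i]
        for j in range(i + 1, n):
            acc -= U[i][j] * w[j]        # exact long-accumulator update
        w[i] = round_div(acc, U[i][i])   # single rounded division per row
    return w
```

For example, `back_substitute([[2, 1], [0, 3]], [7, 9])` returns `[2, 3]`, which satisfies the system exactly.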

Temporarily, we denote by $B$ the quantity $M_w M$ or $M_w M'$, whichever is appropriate to the scaling
being used. Since the final value of each $w_i$ is bounded by $M_w$ it follows that the final
accumulated value before the division is bounded by $B$. Its component parts, using (37), satisfy

$|b_i| < (N+1-i)B, \qquad \left| \sum_{j=i+1}^{N} u_{ij} w_j \right| < (N-i)B$ (38)

from which it follows that no partial result can exceed $(N+1-i)B$ and therefore that the
accumulator lengths derived in the previous section suffice. Furthermore, with such an
accumulator, the only error committed is the rounding in storing the result of the final division.
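
The claim that no partial result exceeds $(N+1-i)B$ can be checked on small examples. The sketch below assumes real integer data with $|u_{ij}| \le M$ and $|w_j| \le M_w$, builds the right-hand side exactly as $b = Uw$, and tests every intermediate accumulator value in row $i$ against $(N+1-i)B$ with $B = M_w M$. The function name `partial_sums_within_bound` is hypothetical, and the check uses a non-strict inequality.

```python
from typing import List


def partial_sums_within_bound(U: List[List[int]], w: List[int],
                              M: int, M_w: int) -> bool:
    """Check that each partial accumulator in row i stays within (N+1-i)*B."""
    n = len(w)
    B = M * M_w
    for i in range(n):                       # 0-based row index
        # Exact right-hand side element b_i = sum_{j >= i} u_ij * w_j.
        acc = sum(U[i][j] * w[j] for j in range(i, n))
        limit = (n - i) * B                  # (N+1-i)*B for 1-based i
        if abs(acc) > limit:
            return False
        for j in range(i + 1, n):
            acc -= U[i][j] * w[j]            # one back substitution update
            if abs(acc) > limit:
                return False
    return True
```

With `U = [[3, -3, 3], [0, 3, -3], [0, 0, 3]]`, `w = [2, -2, 2]`, `M = 3` and `M_w = 2`, the first row starts at 18, which is exactly $3B$, and every later partial value is smaller.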


The wordlengths of Table 4 are suitable for the complete solution process for Gauss elimination
using  integer  arithmetic  with  divisions  in  such  a  way  that  correct  integer  results  are  accumulated 
throughout  the  process  with  correctly  rounded  integer  results  for  division.  The  purpose  of  the 
analysis  which  follows  is  to  analyse  this  particular  form  of  Gauss  elimination  to  determine  the 
data  quantization  and  arithmetic  capability  which  are  required  in  order  to  deliver  a  specified 
accuracy  in  the  final  weights.  In  the  next  section  we  consider  the  error  analysis  aspects  of  this 
question  but,  first,  we  complete  the  analysis  of  the  dynamic  range  requirements. 

Subsequently,  we  will  address  similar  questions  for  the  situation  where  some  scaling  is  necessary 
in  order  to  restrict  the  dynamic  range  to  keep  the  accumulator  size  below  some  threshold  value. 
For  example,  speed  considerations  may  dictate  that  arithmetic  wordlengths  are  kept  below  32  or 
64  bits. 

There  is,  in  fact,  an  even  smaller  bound  available  for  the  right-hand  side  than  that  given  by  (38). 
This  derives  from  the  fact  that  the  first  equation  remains  unchanged  during  the  elimination  so  that 
$b_1$ is bounded by the original $M$ or $M'$. Similarly the second element cannot undergo the full
growth  anticipated  here  and  can  only  achieve  a  magnitude  close  to  or  In  summary ,  the 
factor  of  N  in  the  bounds  (32)  can  be  replaced  by  N-2.  However  this  represents  a  saving  of  only 
1  bit  in  the  N  =  4  case  and  even  less  than  that  for  larger  values  of  N.  For  this  reason  the  overall 
wordlengths  in  Table  4  should  be  used. 

It  is  apparent  that  the  wordlengths  required  are  much  more  moderate  than  was  predicted  in  Tables 
1  and  2  even  though  the  data  quantization  wordlengths  have  a  similar  range  to  those  used  there. 
Nonetheless,  an  accumulator  length  limit  of  even  64  bits  would  place  real  restrictions  on  the  sizes 
of  problems  to  be  solved.  A  smaller  limit  would  clearly  be  very  restrictive  without  some  scaling 
of the right-hand side vector during the computation. Only the right-hand side would need scaling
since the matrix elements are not subject to growth except as a result of rounding errors. This
suggests another possibility: different quantization for the data matrix and the desired response
to keep the dynamic range for the right-hand side smaller. Such a trade-off may be considered
later.

5.  Error  analysis 

We must consider the effect of errors from a variety of sources:

quantization  errors  in  the  data  matrix  and  the  desired  response, 

the  formation  of  the  covariance  matrix  and  cross-correlation  vector,  and 

rounding  errors  in  the  divisions  in  the  elimination  and  back  substitution  phases. 

Conventional  error  analyses  can  be  used  for  some  parts  of  this  but  the  fixed-point  arithmetic  with 
extended  accumulator  that  was  discussed  previously  does  not  lend  itself  immediately  to  those 
analyses which are liable to produce overly pessimistic results in this case.

The  first-order  effect  of  the  propagation  of  the  data  errors  can  be  modeled  on  conventional 
analyses such as those of Wilkinson [14], [13] but the results which are included below for
completeness  are  inappropriate  for  our  integer  arithmetic  if  the  errors  are  such  that  second  order 
effects  are  truly  negligible. 

Because of the "integerized" nature of the data matrix and desired response, the real and
imaginary parts of the quantization errors are each bounded by 1/2 so that the error in any
element of the data matrix is bounded by $1/\sqrt{2}$. In this section, we denote the computed
covariance matrix and cross-correlation vector by $\hat{A}$ and $\hat{b}$ respectively to distinguish these from
their theoretical counterparts. We shall also denote the computed solution for the weights by $\hat{w}$.

First, we consider the effects of the quantization errors on the computed solution. Here and
throughout this section we shall only consider first-order effects. The elements of the data matrix
are quantized to complex integers which (assuming correct rounding) have errors $\le 1/2$ in
both their real and imaginary parts. Hence the errors in the data matrix are bounded by:

$|\delta x_{ij}| \le 1/\sqrt{2}$ (39)

Elements of the (scaled) covariance matrix are formed from inner products of the snapshot
vectors. A product of two such numbers, $u$, $v$ say, computed in exact integer arithmetic has an
error bounded by

$|\delta(uv)| = |u\,\delta v + v\,\delta u + \delta u\,\delta v| \le \frac{|u| + |v|}{\sqrt{2}} < 2^{q_x+1}$ (40)

neglecting the second-order term. It follows that the computed elements of the scaled covariance
matrix have errors bounded by

$|\delta a_{ij}| < K\,2^{q_x+1}$ (41)

which is equivalent to the statement that (at least) the first $q_x$ bits of each element are correct.
This leads to a third natural possible scaling of the original problem in which the matrix and
right-hand side would be stored to this accuracy. This gives rise to a modified error and range
analysis which will be considered later.
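
The product error bound can be illustrated numerically. The sketch below is an assumption-laden illustration rather than part of the analysis: it quantizes two complex values to complex integers with Python's built-in `round` (which rounds ties to even, still giving errors of at most 1/2 per part), and compares the actual product error with the first-order bound $(|u| + |v|)/\sqrt{2}$ plus the second-order allowance $|\delta u\,\delta v| \le 1/2$. The names `quantize` and `product_error_and_bound` are hypothetical.

```python
import math
from typing import Tuple


def quantize(z: complex) -> complex:
    """Round real and imaginary parts to the nearest integers."""
    return complex(round(z.real), round(z.imag))


def product_error_and_bound(u: complex, v: complex) -> Tuple[float, float]:
    """Return (actual error of the quantized product, error bound)."""
    uq, vq = quantize(u), quantize(v)
    err = abs(u * v - uq * vq)
    # First-order term: each factor is perturbed by at most 1/sqrt(2).
    first_order = (abs(uq) + abs(vq)) / math.sqrt(2)
    # Second-order term |du * dv| is at most (1/sqrt(2))**2 = 1/2.
    return err, first_order + 0.5
```

For moderately large inputs the second-order allowance is negligible next to the first-order term, which is the sense in which it is neglected in the text.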


From (41), it follows that

$\|\delta A\|_\infty < N K\,2^{q_x+1}$ (42)

and, similarly,

$\|\delta b\|_\infty < K\,2^{q_x+1}$ (43)

For the true (unscaled) covariance matrix, the rounding error resulting from the division by $K$
is of similar magnitude to the already neglected second-order error term and so there are bounds
similar to those in (41) - (43) except that the factor $K$ is not present.

To estimate the effect of these errors on the computed solution, we use a first-order analysis
which is a slight modification of the usual Wilkinson-style relative error analysis to this situation.
With no arithmetic errors during the solution process, the computed solution $\hat{w}$ satisfies

$(A + \delta A)\hat{w} = b + \delta b = Aw + \delta b$ (44)

from which we obtain the following error "bound" which is dependent on the computed solution:

$\|\hat{w} - w\|_\infty \le \|A^{-1}\|_\infty \left( \|\delta A\|_\infty \|\hat{w}\|_\infty + \|\delta b\|_\infty \right)$ (45)

Using the first-order estimate:

$\|\hat{w}\|_\infty \approx \|w\|_\infty = \|A^{-1}b\|_\infty \le \|A^{-1}\|_\infty \|b\|_\infty$ (46)

and using Compton's eigenvalue analysis, we see from (10) that $\lambda_{\min} > 1$ so that $\|A^{-1}\| < 1$.
Substituting this in (46) and (45), we obtain the bounds

$\|\delta w\|_\infty = \|\hat{w} - w\|_\infty \le \|\delta A\|_\infty \|\hat{w}\|_\infty + \|\delta b\|_\infty \le \|\delta A\|_\infty \|b\|_\infty + \|\delta b\|_\infty$ (47)

which indicate that the error bound is of the same order of magnitude as the solution itself. Such
a bound is not useful.
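
The step from the perturbed system to the bound (45) may be worth spelling out. The following is a sketch under the stated assumption of no arithmetic errors during the solution; $\hat{w}$ denotes the computed solution and the infinity norm is assumed throughout.

```latex
\begin{align*}
(A + \delta A)\,\hat{w} &= Aw + \delta b \\
A(\hat{w} - w) &= \delta b - \delta A\,\hat{w} \\
\hat{w} - w &= A^{-1}\left(\delta b - \delta A\,\hat{w}\right) \\
\|\hat{w} - w\|_\infty &\le \|A^{-1}\|_\infty
    \left(\|\delta A\|_\infty\,\|\hat{w}\|_\infty + \|\delta b\|_\infty\right)
\end{align*}
```

The last line uses only the triangle inequality and the consistency of the matrix norm with the vector norm.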

Wilkinson  [13]  also  includes  a  summary  of  the  corresponding  fixed-point  analysis  for  the 
situation  where  the  correct  binary  exponent  of  all  elements  of  the  solution  is  known  -  and  is 
constant  over  the  weight  vector.  Such  an  analysis  is  not  immediately  applicable  in  our  situation 
but  its  basic  principles  can  be  used  here  if  we  make  the  one  additional  assumption  that  the 
magnitudes  of  the  weights  (both  true  and  computed)  are  such  that  second  order  error  effects  can 
be  neglected.  We  shall  make  this  assumption  throughout  the  remainder  of  this  section. 

Since  we  are  computing  integer  solutions  using  longer  wordlengths  than  are  strictly  necessary  this 
is  not  much  more  than  the  assumption  that  all  weights  are  nonzero  which  in  turn  is  equivalent 
only  to  saying  that  there  is  no  redundancy  in  the  antenna  array. 

From our earlier analysis, (41) and (43), we know that the maximum error in elements of the
(scaled) covariance matrix and cross-correlation vector is bounded by

$K\,2^{q_x+1} =: E$ (48)

say. There is a corresponding error bound for the unscaled case:

$2^{q_x+1}$ (49)

Wilkinson's [13] pp 211-2 fixed point error analysis can be modified to our situation by regarding
the  various  integer  quantities  as  fixed  point  fractions  of  some  global  bound.  The  wordlengths 
chosen  are  then  sufficient  for  exact  accumulation  of  scalar  products  and  for  the  use  of  this  "long 
accumulator"  for  division. 

With this interpretation, it also follows that the assumption [13] p212 that all matrix elements
remain bounded by unity throughout the solution process is valid in this case. This follows from
the "no-growth" result summarized in (16). In turn this implies that results corresponding to those
of [13] pp 209-11 are valid for our system and arithmetic.

From (41), we have already observed that the leading $q_x$ bits of all matrix elements are correct.
Neglecting any second-order effects and recalling that, because of the greater accumulator lengths
discussed above, any inner products and divisions can be formed using a "long" accumulator,
there is then a maximum error in elements of the upper triangular factor and corresponding
right-hand side, regarded as fractions, of $(N-1)\,2^{-q_x}$.

The  dynamic  range  analysis  for  back  substitution  in  the  last  section  establishes  that  the  inner 
products  formed  during  this  phase  can  be  computed  exactly.  Compare  Section  11  of  [14]  for  the 
situation  where  the  order  of  magnitude  of  the  components  of  the  solution  is  known.  The  critical 
feature  of  that  analysis  is  that  the  magnitudes  of  the  roundoff  errors  are  then  determined  by  the 
working  precision  or  wordlength.  Although  the  corresponding  order  of  magnitude  is  neither  fixed 
nor  known  here,  the  dynamic  range  established  in  the  preceding  sections  implies  knowledge  of 
the  magnitude  of  roundoff  errors. 

Again interpreting all our integers as fixed-point fractions, it then follows that the back
substitution results in a further error bounded by $2^{-q_x}$. Neglecting any second-order
effects, it follows that the final computed solution has components with errors bounded by $NE$
(or the corresponding unscaled quantity). Such an error corresponds to a further loss in precision
of at most $\log N$ bits in the real and imaginary parts of the weights. Thus we require that the
data quantization $q_x$ be at least this much greater than the weight quantization $q_w$, that is

$q_x \ge q_w + \log N$ (50)

from which using (30) - (35) it follows that

$L_w \ge 2(q_w + \log N) + 1$
$L_s \ge 4 q_w + 5 \log N + 5/2 \; (+ \log K)$ (51)
$L_a \ge 6 q_w + 7 \log N + 4 \; (+ 2 \log K)$

where the final parenthetic terms are included in the scaled case.
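
As a sketch of how (50) and (51) translate into concrete wordlengths, the function below evaluates them for a given weight quantization. It assumes base-2 logarithms, the readings $q_x \ge q_w + \log_2 N$, $L_s \ge 4q_w + 5\log_2 N + 5/2\;(+\log_2 K)$ and $L_a \ge 6q_w + 7\log_2 N + 4\;(+2\log_2 K)$, and rounding up to whole bits; the name `revised_wordlengths` is hypothetical.

```python
import math
from typing import Optional, Tuple


def revised_wordlengths(q_w: int, n: int,
                        k: Optional[int] = None) -> Tuple[int, int, int]:
    """Return (q_x, L_s, L_a) in bits for weight quantization q_w.

    k is the number of snapshots for the scaled case; k=None selects the
    unscaled case, in which the parenthetic log K terms are omitted.
    """
    log_n = math.log2(n)
    log_k = math.log2(k) if k is not None else 0.0
    q_x = q_w + log_n                                # data quantization, (50)
    l_s = 4 * q_w + 5 * log_n + 2.5 + log_k          # right-hand side, (51)
    l_a = 6 * q_w + 7 * log_n + 4 + 2 * log_k        # accumulator, (51)
    # Rounding up to whole bits is an assumption of this sketch.
    return math.ceil(q_x), math.ceil(l_s), math.ceil(l_a)
```

With `q_w = 6`, `n = 4` and `k = 8` (20 dB, K = 2N) this gives 8, 40 and 60 bits, and 8, 37 and 54 bits in the unscaled case.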

In every case in Table 3, this yields a value for $L_w > 1.5\,q_w$ so that the wordlengths derived in
Table 4 are inadequate for this process. We use (50) and (51) to get the revised wordlengths

TABLE 5

Wordlengths $q_x$, $L_s$, $L_a$ given by (50) and (51) with $q_w$ taken from Table 3

(a) SNR_J = 20 dB

            K = 2N            K = 4N           Unscaled
N        4    8   16       4    8   16       4    8   16
q_x      8   10   12       8   10   12       8   10   12
L_s     40   50   60      41   51   61      37   46   55
L_a     60   75   90      62   77   92      54   67   80

(b) SNR_J = 30 dB

            K = 2N            K = 4N           Unscaled
N        4    8   16       4    8   16       4    8   16
q_x     11   13   15      11   13   15      11   13   15
L_s     52   62   72      53   63   73      49   58   67
L_a     78   93  108      80   95  110      72   85   98

(c) SNR_J = 40 dB

            K = 2N            K = 4N           Unscaled
N        4    8   16       4    8   16       4    8   16
q_x     15   17   19      15   17   19      15   17   19
L_s     68   78   88      69   79   89      65   74   83
L_a    102  117  132     104  119  134      96  109  122

(d) SNR_J = 50 dB

            K = 2N            K = 4N           Unscaled
N        4    8   16       4    8   16       4    8   16
q_x     18   20   22      18   20   22      18   20   22
L_s     80   90  100      81   91  101      77   86   95
L_a    120  135  150     122  137  152     114  127  140

We note that these wordlengths are such that even a 128-bit processor is inadequate for many
problems. This suggests that some scaling would be necessary in order to keep wordlengths to
a practical level.

One natural approach to this which should be pursued is to take advantage of the fact that only
$q_x$ bits of the initial matrix and right-hand side are correct and so store only these. This is
equivalent to a scaling of the linear system which halves the initial wordlengths from which the
growth takes place. Of course this does not simply mean that all subsequent wordlengths are
halved, and both the dynamic range and error analyses need to be reworked for this situation.

Conclusions 

In  this  paper,  we  have  derived  equations  to  allow  trade-offs  between  word  size  for  the 
adaptive  weights,  data  quantization,  dynamic  range  (word  length),  and  accumulator  word  size, 
for  the  conventional  Gauss  Elimination  algorithm,  using  an  integer  processor.  We  have  found  that 
a very large word length is required for a moderately sized adaptive beamforming problem. It
is obvious that for large problems, say greater than 16 antenna elements, scaling is required
to keep the word size down. This scaling will degrade the accuracy of the adaptive weights.

This  problem  must  be  examined  further  to  validate  the  practicality  of  using  an  integer  (e.g.  RNS) 
processor.  In  a  subsequent  paper,  we  will  study  the  trade-offs  for  the  divisionless  Gauss 
Elimination  algorithm  of  Kirsch  and  Turner  or  QR-Decomposition,  using  scaling  instead  of 
division  in  the  conventional  implementation. 
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